Entry Name:  “CMICH-Li-MC2”

VAST Challenge 2015
Mini-Challenge 2

 

 

Team Members:

Ting Li, Central Michigan University, li2t@cmich.edu PRIMARY

Qi Liao, Central Michigan University, qi.liao@cmich.edu

 

Student Team:  NO

 

Did you use data from both mini-challenges?  NO

 

Analytic Tools Used:

The tool is called DinoFun World Communication Graph (DWCG). DWCG is a visualization tool designed mainly for this challenge; it was developed by graduate student Ting Li of Central Michigan University under Dr. Qi Liao’s guidance. In this tool, nodes represent the IDs in the park and edges between nodes represent communications. Each edge is divided by time into 10 segments, and different colors denote the locations where the communication happens.
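As a rough illustration of the time segmentation (not DWCG's actual code), the sketch below maps a timestamp to one of 10 equal segments of a selected period. The function name `segment_index` and the 08:00 to 23:00 window used in the usage note are assumptions made for this example only.

```python
from datetime import datetime


def segment_index(ts, start, end, n_segments=10):
    """Return the 1-based segment (1..n_segments) that ts falls into.

    The selected period [start, end] is cut into n_segments equal parts.
    """
    if not (start <= ts <= end):
        raise ValueError("timestamp outside the selected period")
    span = (end - start).total_seconds()
    offset = (ts - start).total_seconds()
    # The final instant of the period belongs to the last segment.
    return min(int(offset / span * n_segments), n_segments - 1) + 1
```

With an assumed single-day window of 08:00 to 23:00, `segment_index` places 12:00 in segment 3 and 21:00 in segment 9. When a longer period is selected, the same clock time falls into a different segment.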

 

Approximately how many hours were spent working on this submission in total?

Around 200 hours.

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

Video Download

Video:

  http://people.cst.cmich.edu/liao1q/video/CMICH-LI-MC2.wmv

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 – Identify those IDs that stand out for their large volumes of communication.  For each of these IDs

 

      a.      Characterize the communication patterns you see.

      b.      Based on these patterns, what do you hypothesize about these IDs?

 

Limit your response to no more than 4 images and 300 words.

 

Using DWCG, we sort all IDs that appear during the three days by communication volume in the tool’s Node List (left panel). As Figure 1-1 shows, 1278894, 839736, and external (highlighted in red) are the three IDs that stand out for their large volumes of communication.


Figure 1-1: Over the whole period, IDs 1278894, 839736, and external communicate more often than other IDs, according to our tool’s Node List.
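A ranking like the one in the Node List can be approximated outside the tool. The sketch below is a minimal example, assuming each record is a `(timestamp, sender, receiver, location)` tuple; that layout and the name `rank_by_volume` are assumptions for illustration, not DWCG internals.

```python
from collections import Counter


def rank_by_volume(records):
    """Rank IDs by the number of communications they take part in.

    records: iterable of (timestamp, sender, receiver, location) tuples
    (assumed layout). Each record counts once for each endpoint.
    """
    volume = Counter()
    for _ts, sender, receiver, _location in records:
        volume[sender] += 1
        volume[receiver] += 1
    # most_common() returns (id, count) pairs, highest count first.
    return volume.most_common()
```

Counting both endpoints means an ID that only receives messages, such as external, is still ranked by its total traffic.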

 

a.

1278894: its communications with other IDs are usually bidirectional; they start at Entry Corridor around 12 pm each day and stop around 9 pm each day. Compared to other IDs, it communicates more often. Figure 1-2 is a good example of this pattern: on Friday, the communication between 1278894 (highlighted in red) and 1061778 starts in the third segment (black marks the starting time; the third segment is around 12 pm) at Entry Corridor (purple) and ends in the ninth segment, which is close to 9 pm. The edge also has arrows at both ends, which indicates bidirectional communication, and 1061778 communicates more frequently with 1278894 than with other IDs (the corresponding edge has more colors than the other edges).


Figure 1-2: 1278894’s communications with other IDs usually start at Entry Corridor at 12 pm and end around 9 pm.

 

839736: its communications with other IDs are also usually bidirectional, and they tend to start at different locations but end at Entry Corridor. Figure 1-3 shows one example: 839736’s (highlighted in red) communications with other IDs tend to end at Entry Corridor (purple) and are bidirectional.


Figure 1-3: 839736’s communications with other IDs tend to end at Entry Corridor.

 

external: communications between external and other IDs are always unidirectional, and external is always the destination ID. The details are shown in Figure 1-4: external (highlighted in red) is always the destination, as each edge has only one arrow, on external’s side.


Figure 1-4: Communications between external and other IDs are always unidirectional; external is always the destination ID.
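The direction patterns described in part a could also be checked directly against the records. The following sketch assumes the same hypothetical `(timestamp, sender, receiver, location)` tuples; `direction_counts` and `receive_only` are illustrative names, not part of DWCG.

```python
from collections import defaultdict


def direction_counts(records):
    """Count how many communications each ID sent and received."""
    counts = defaultdict(lambda: {"sent": 0, "received": 0})
    for _ts, sender, receiver, _location in records:
        counts[sender]["sent"] += 1
        counts[receiver]["received"] += 1
    return dict(counts)


def receive_only(records):
    """Return the sorted IDs that appear only as a destination."""
    return sorted(
        node for node, c in direction_counts(records).items() if c["sent"] == 0
    )
```

An ID such as external, which is described above as always being the destination, would be expected to show up in the result of `receive_only`.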

 

b.

1278894: its communications with other IDs always start at Entry Corridor, which indicates that it always contacts the other IDs first; the raw data also show that 1278894 is located at Entry Corridor. In addition, it only appears from around 12 pm to 9 pm each day. We therefore hypothesize that it is a specific park service located at Entry Corridor, such as a check-in system.

 

839736: its communications with other IDs tend to start at different places but end at Entry Corridor, which suggests that other IDs usually contact 839736 first and 839736 then replies from Entry Corridor. We therefore hypothesize that 839736 is one of the park services, such as a ride service or check-out system, also located at Entry Corridor.

 

external: as external is always the destination in all of the communications, we infer that only people in the park can initiate communication with people outside; people outside the park cannot communicate with people inside.

 

MC2.2 – Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.

 

Limit your response to no more than 10 images and 1000 words.

 

1. Over the three days, 999764 contacts other IDs, such as 203759 and 2049559, only on Saturday; around 6 pm to 8 pm (the sixth segment on the edge) it communicates heavily at Coaster Alley (pink). The detailed graph is shown in Figure 2-1:


Figure 2-1: Around 6 pm to 8 pm, 999764, which appears only on Saturday, communicates heavily at Coaster Alley.

 

2. Over the three days, 1356570 communicates only on Sunday; around 12 pm it communicates heavily with other IDs, such as 1138286 and 1297289, at Coaster Alley. This is shown in Figure 2-2: 1356570 (highlighted in red) contacts other IDs only on Sunday (the edges are all gray until the eighth segment), and in the ninth segment (around 12 pm) it makes a lot of communications at Coaster Alley (pink).


Figure 2-2: On Sunday around 12 pm, 1356570 makes a lot of communications with other IDs at Coaster Alley.

 

3. On Sunday, 1882328 (highlighted in red) communicates quite often with other IDs, such as 412874 and 1160780, and these communications follow a similar pattern: it communicates heavily at Coaster Alley (pink) from around 9 am to 12 pm (first to third segments) and again at Coaster Alley near 8 pm (the eighth segment); it then tends to communicate at Entry Corridor (purple) and Wet Land (green), and returns to Coaster Alley around 10 pm.


Figure 2-3: On Sunday, 1882328’s communications with other IDs follow a pattern: a lot of communications at Coaster Alley from 9 am to 12 pm, at 8 pm, and at 10 pm.

 

4. Over the three days, apart from 1278894, external, and 839736, 535028 contacts only four IDs: 936244, 303603, 197811, and 437652. It contacts 936244 on all three days, and 303603, 197811, and 437652 only on Friday and Saturday. All of its communications with these four IDs start at Coaster Alley and, except for those with 936244, all end at Entry Corridor. These four IDs also share some contacts over the whole period, such as 747874 and 324556. The details are shown in Figure 2-4 (since the patterns are the same on all three days, we use Saturday as a representative day for illustration):


Figure 2-4: 535280 contacts only seven IDs (highlighted in red) over the whole selected period; apart from 1278894, 839736, and external, the other four IDs share many contacts over the three days.

 

5. On Friday, 952821 (highlighted in red) tends to communicate with other IDs, such as 1143639 and 843401, at Coaster Alley (pink) around 12:40 pm to 12:50 pm, and most of its communications end at Entry Corridor (purple). The IDs that communicate with 952821 also communicate with 813540 (also highlighted in red) during this period, as shown in Figure 2-5:


Figure 2-5: 952821 tends to follow a Coaster Alley to Entry Corridor communication pattern during the 12 pm hour on Friday, and the IDs that communicate with 952821 also communicate with 813540 in this period.

 

6. Over the whole period, some IDs only contact each other and 1278894, 839736, and external. As shown in Figure 2-6, they are 328117, 1660730, 1809394, and 520054. Among them, 328117 has almost the same communication pattern as 328117 to 1278894: both communicate on all three days, and their communication locations change in the same way, for example from Coaster Alley to Wet Land to Tundra Land and back to Coaster Alley on Friday.


Figure 2-6: 328117, 1660730, 1809394, and 520054 only communicate with each other and with 1278894, 839736, and external during the whole three days.

 

7. Over the three days, 1045438 (highlighted in red) communicates only on Saturday. Most of its communications with other IDs, such as 1146036 and 1846909, start at Coaster Alley (pink) around 11 am, move to Entry Corridor (purple) near 1 pm, return to Coaster Alley around 4 pm, move to Wet Land (green) around 5 pm, return to Coaster Alley at 9 pm, and finally end at Wet Land. This pattern can be seen in Figure 2-7:


Figure 2-7: Most of 1045438’s communications follow a Coaster Alley, Entry Corridor, Coaster Alley, Wet Land, Coaster Alley, Wet Land pattern.

 

 8. IDs that communicate with 839736 tend to communicate with external too over the three days. Figure 2-8 is a good example: 1806328, 1632966, 400873, 2002872, 317825, 261021, and 386585 all communicate with both 839736 and external. 386585, 1806328, 261021, and 400873 communicate with external and 839736 on all three days, at nearly five different places; 1632966 communicates with 839736 and external on Saturday only; and 317825 and 2002872 both communicate with external and 839736 on Sunday. The details are shown in Figure 2-8:

Figure 2-8: IDs which communicate with 839736 also tend to communicate with external.
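
The overlap described in this pattern amounts to a set intersection over contact lists. The sketch below assumes the same hypothetical `(timestamp, sender, receiver, location)` record tuples; the helper names `contacts_of` and `shared_contacts` are made up for this example.

```python
def contacts_of(records, target):
    """Return the set of IDs that exchanged any communication with target."""
    partners = set()
    for _ts, sender, receiver, _location in records:
        if sender == target:
            partners.add(receiver)
        elif receiver == target:
            partners.add(sender)
    return partners


def shared_contacts(records, first, second):
    """Return the IDs that communicate with both first and second."""
    return contacts_of(records, first) & contacts_of(records, second)
```

Calling `shared_contacts(records, "839736", "external")` on the full data would list every ID that talks to both, which could then be compared with the IDs named in Figure 2-8.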

 

9. Over the three days, 895998 contacts other IDs only on Saturday, from around 3 pm to 9 pm, at Coaster Alley. These IDs are 2022258, 1314933, 589222, 839736, and external. The detailed graph is shown in Figure 2-9: over the whole period, 895998 contacts only five other IDs at Coaster Alley (pink), around the same segment. Entry Corridor (purple) is the location from which 839736 replies to 895998.


Figure 2-9: Over the three days, 895998 contacts only five IDs, at Coaster Alley from around 3 pm to 9 pm.

 

10. From Friday to Sunday, the IDs that communicate only once all turn out to contact external only. As Figure 2-10 shows, these IDs are: 1458915, at Wet Land (green) on Saturday near 10 am; 688489, at Wet Land on Sunday near 12 pm; 1680161, at Kiddie Land (yellow) on Saturday near 11 am; 596672, at Tundra Land (blue) on Saturday near 9 am; 1763672, at Kiddie Land on Friday near 9 am; 365259, at Wet Land on Sunday near 11 am; 474843, at Wet Land on Sunday near 11 am; 1336870, at Wet Land on Friday near 3 pm; and 215220, at Kiddie Land on Sunday near 10 am.


Figure 2-10: During the three days, only nine IDs communicate exactly once, and all of them contact only external (highlighted in red).
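
One-time communicators like these could be found with a single pass over the records. This is a minimal sketch under the same assumed `(timestamp, sender, receiver, location)` layout; `single_contact_ids` is a hypothetical helper name.

```python
from collections import defaultdict


def single_contact_ids(records):
    """Map each ID with exactly one communication to (partner, location, timestamp)."""
    events = defaultdict(list)
    for ts, sender, receiver, location in records:
        # Record the event from the point of view of both endpoints.
        events[sender].append((receiver, location, ts))
        events[receiver].append((sender, location, ts))
    return {node: seen[0] for node, seen in events.items() if len(seen) == 1}
```

Checking that every partner in the returned mapping is external would reproduce the observation in this pattern.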

 

MC2.3 From this data, can you hypothesize when the crime was discovered?  Describe your rationale.

 

Limit your response to no more than 3 images and 300 words. 

 

The crime may have been discovered on Sunday around 17:40:00.

 

We learn from the provided background and data description that there were two shows each day at the park’s Grinosaurus Stage and that the crime happened at Creighton Pavilion; both are in Coaster Alley. Coaster Alley is therefore the location we should pay most attention to while analyzing the data.

 

Based on the provided MC2 data, we count the communications in each hour of each day and obtain the following line chart:

Figure 3-1: Number of communications in each hour on Friday (red line), Saturday (green line), and Sunday (blue line).

 

Figure 3-1 shows that the hourly communication volumes follow a similar trend on all three days: around 12 pm and 6 pm the volumes increase on all three days and are high compared to other times (highlighted in yellow).
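
The hourly totals behind a chart like Figure 3-1 could be computed along the following lines. This is a sketch, not the script used for the figure: the `(timestamp, sender, receiver, location)` tuples, the timestamp format string, and the name `hourly_counts` are all assumptions.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(records, fmt="%Y-%m-%d %H:%M:%S"):
    """Count communications per (date, hour).

    records: iterable of tuples whose first field is a timestamp string
    in the (assumed) format fmt.
    """
    counts = Counter()
    for ts, *_rest in records:
        moment = datetime.strptime(ts, fmt)
        counts[(moment.date().isoformat(), moment.hour)] += 1
    return counts
```

Plotting one line per date, with the hour on the x-axis and the count on the y-axis, gives a per-day comparison of the kind described above.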

 

All of the above information is the starting point of our analysis, and from it we derive the patterns in MC2.2.

 

From the patterns in MC2.2, especially the first, second, and third, along with the pattern in Figure 3-2, where 825466 communicates heavily with other IDs in the fifth segment (around 12 pm) at Coaster Alley (pink), we see that around 12 pm and from around 6 pm to 8 pm each day there tend to be many communications at Coaster Alley. This is consistent with Figure 3-1; hence we assume that 12 pm and 6 pm may be the show times each day.


Figure 3-2: On Friday around 12 pm, 825466 makes a lot of communications with other IDs at Coaster Alley.

 

The third pattern in MC2.2 and the pattern in Figure 3-3 both indicate that on Sunday near 8 pm some IDs, such as 1882328 and 195725, communicate heavily at Coaster Alley, then tend to communicate at Entry Corridor, Wet Land, and Tundra Land, and return to Coaster Alley around 10 pm. This is also consistent with the news report, which stated that the park was closed once the vandalism was discovered; so the crime may have been discovered around 8 pm, with 8 pm to 10 pm being the park closing time.


Figure 3-3: On Sunday, from around 9 am to 12 pm (first to third segments), 195725 communicates heavily at Coaster Alley (pink); near 8 pm (eighth segment) it again communicates often at Coaster Alley, then it moves to Tundra Land and Entry Corridor, and around 10 pm (tenth segment) it returns to Coaster Alley.

 

After identifying the possible time period, we referred to the raw data and found that a more precise crime discovery time is around 17:40:00, as the eighth segment corresponds to about 17:40:00 in the raw data.